The quantification of fat deposits in the surroundings of the heart is an accurate procedure for evaluating health risk factors associated with several diseases. However, this type of evaluation is not widely employed in clinical practice due to the required human workload. This work proposes a novel technique for the automatic segmentation of cardiac fat pads. The technique is based on applying classification algorithms to the segmentation of cardiac CT images. Furthermore, we extensively evaluate the performance of several algorithms on this task and discuss which ones provide better predictive models. Experimental results show a mean accuracy of 98.4% for the classification of epicardial and mediastinal fats, with a mean true positive rate of 96.2%. On average, the Dice similarity index between the segmented patients and the ground truth was 96.8%. Our technique has therefore achieved the most accurate results for the automated segmentation of cardiac fats reported to date.
The deposits of fat in the surroundings of the heart are correlated with several health risk factors, such as atherosclerosis, carotid stiffness, coronary artery calcification, and atrial fibrillation, among others. These deposits vary independently of obesity, which reinforces the need for their direct segmentation for further quantification. However, manual segmentation of these fats has not been widely deployed in clinical practice because of the human workload required and the consequent high cost for physicians and technicians. In this work, we propose a unified method for the autonomous segmentation and quantification of two types of cardiac fat. The segmented fats are termed epicardial and mediastinal and are separated from each other by the pericardium. Much effort was devoted to achieving minimal user intervention. The proposed method consists mainly of registration and classification algorithms to perform the desired segmentation. We compared the performance of several classification algorithms on this task, including neural networks, probabilistic models, and decision tree algorithms. Experimental results of the proposed method show a mean accuracy of 98.5% for epicardial and mediastinal fats (99.5% if the features are normalized), with a mean true positive rate of 98.0%. On average, the Dice similarity index was 97.6%.
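As a rough illustration of the classification stage only, the sketch below trains a normalized-feature classifier with scikit-learn. The synthetic features produced by make_classification stand in for per-voxel CT features and fat labels, and the random forest is an illustrative choice rather than one of the specific models compared in the work; the registration step is omitted entirely.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for per-voxel features and epicardial/mediastinal/background labels.
X, y = make_classification(n_samples=1000, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)

# Feature normalization followed by a classifier, mirroring the reported
# benefit of normalizing features before classification.
clf = make_pipeline(StandardScaler(),
                    RandomForestClassifier(n_estimators=200, random_state=0))
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(scores.mean())
```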
This work proposes a family of distances that combines the Minkowski and Chebyshev distances and can be seen as intermediate between them. This combination not only achieves efficient run times in neighbourhood-iteration tasks in Z^2, but also obtains good accuracy when coupled with the k-nearest neighbours (k-NN) classifier. The proposed distance is approximately 1.3 times faster than the Manhattan distance and 329.5 times faster than the Euclidean distance for discrete neighbourhood iteration. An accuracy analysis of the k-NN classifier is presented using a total of 33 datasets from the UCI repository, 15 distances, and values of k ranging from 1 to 200. In this experiment, the proposed distance outperformed its more common counterparts on average (in 26 of the 33 cases) and also achieved the best accuracy most frequently (in 9 of the 33 cases).
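A minimal sketch of the general idea follows, assuming a simple convex combination of the Manhattan (Minkowski with p=1) and Chebyshev distances; this blend and its weight w are illustrative assumptions, not necessarily the exact family proposed in the work.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

def minkowski_chebyshev(x, y, w=0.5, p=1):
    """Convex combination of a Minkowski distance (p=1: Manhattan) and Chebyshev."""
    diff = np.abs(np.asarray(x) - np.asarray(y))
    minkowski = np.sum(diff ** p) ** (1.0 / p)
    chebyshev = np.max(diff)
    return (1.0 - w) * minkowski + w * chebyshev

X, y = load_iris(return_X_y=True)
# Callable metrics are supported by scikit-learn but evaluated in Python per
# pair of samples, so brute-force search keeps things simple (and slow).
knn = KNeighborsClassifier(n_neighbors=5, metric=minkowski_chebyshev, algorithm="brute")
knn.fit(X, y)
print(knn.score(X, y))
```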
This work proposes the use of a genetic algorithm (GA) in the process of tracing and recognizing the human pericardium contour in computed tomography (CT) images. We assume that each slice of the pericardium can be modelled by an ellipse, whose parameters need to be optimally determined. The optimal ellipse is one that closely follows the pericardium contour and, consequently, properly separates the epicardial and mediastinal fats of the human heart. Tracing and automatically recognizing the pericardium contour aids medical diagnosis. Usually, this process is either done manually or not done at all because of the effort required. Furthermore, detecting the pericardium may improve previously proposed automated methods that separate the two types of fat associated with the human heart. Quantifying these fats provides important health risk marker information, since they are associated with the development of certain cardiovascular pathologies. Finally, we conclude that the GA provides satisfactory solutions within a feasible amount of processing time.
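The sketch below shows a bare-bones GA of the kind described, fitting ellipse parameters (centre, axes, rotation) to a set of contour points. The toy contour, the algebraic-residual fitness, and the GA hyperparameters are illustrative assumptions rather than the work's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual(params, points):
    """Mean algebraic residual of the ellipse equation (0 means a perfect fit)."""
    cx, cy, a, b, theta = params
    c, s = np.cos(theta), np.sin(theta)
    dx, dy = points[:, 0] - cx, points[:, 1] - cy
    u = (c * dx + s * dy) / a        # coordinates in the rotated ellipse frame
    v = (-s * dx + c * dy) / b
    return np.mean(np.abs(u ** 2 + v ** 2 - 1.0))

def fit_ellipse_ga(points, pop_size=100, generations=200, mutation=0.1):
    span = np.ptp(points, axis=0).max()
    lo = np.array([points[:, 0].min(), points[:, 1].min(), 1.0, 1.0, 0.0])
    hi = np.array([points[:, 0].max(), points[:, 1].max(), span, span, np.pi])
    pop = rng.uniform(lo, hi, size=(pop_size, 5))
    for _ in range(generations):
        fitness = np.array([residual(ind, points) for ind in pop])
        parents = pop[np.argsort(fitness)[: pop_size // 2]]          # keep the best half
        idx = rng.integers(0, len(parents), size=(pop_size, 2))
        mask = rng.random((pop_size, 5)) < 0.5                        # uniform crossover
        children = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
        children += rng.normal(0, mutation * (hi - lo), children.shape)  # Gaussian mutation
        pop = np.clip(children, lo, hi)
    return pop[np.argmin([residual(ind, points) for ind in pop])]     # (cx, cy, a, b, theta)

# Toy contour: noisy samples from a known ellipse stand in for pericardium edge points.
t = np.linspace(0, 2 * np.pi, 200)
points = np.column_stack([60 + 40 * np.cos(t), 50 + 25 * np.sin(t)])
points += rng.normal(0, 0.5, points.shape)
print(fit_ellipse_ga(points))
```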
Model estimates obtained from traditional subspace identification methods may be subject to significant variance. This elevated variance is aggravated in the cases of large models or of a limited sample size. Common solutions to reduce the effect of variance are regularized estimators, shrinkage estimators and Bayesian estimation. In the current work we investigate the latter two solutions, which have not yet been applied to subspace identification. Our experimental results show that our proposed estimators may reduce the estimation risk up to $40\%$ of that of traditional subspace methods.
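For intuition only, the sketch below applies generic shrinkage toward zero and compares its Monte Carlo risk with that of the unshrunk estimates; the actual shrinkage and Bayesian estimators studied for subspace identification are not reproduced here, and theta_true, the noise level, and the shrinkage intensity are made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = np.array([0.20, -0.10, 0.05])                     # "true" model parameters
noisy = theta_true + 0.5 * rng.standard_normal((10_000, 3))    # traditional (high-variance) estimates

def shrink(theta_hat, lam):
    """Shrink an estimate toward zero with intensity lam in [0, 1]."""
    return (1.0 - lam) * theta_hat

risk_plain = np.mean(np.sum((noisy - theta_true) ** 2, axis=1))
risk_shrunk = np.mean(np.sum((shrink(noisy, 0.3) - theta_true) ** 2, axis=1))
print(risk_plain, risk_shrunk)   # the shrunk estimator trades a little bias for lower risk here
```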
Stress has a profound effect on people's lives that should not be understated. While it can be beneficial, since it helps humans adapt to new and different situations, it can also be harmful when not dealt with properly, leading to chronic stress. The objective of this paper is to develop a stress monitoring solution that can be used in real life while tackling this challenge in a positive way. The SMILE data set was provided to team Anxolotl, and all that was needed was to develop a robust model. We developed a supervised learning model for classification in Python, achieving a final accuracy of 64.1% and an F1-score of 54.96%. The resulting solution passed the robustness test, showing low variation between runs, which is a major point in favor of its possible future integration into the Anxolotl app.
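A hedged sketch of this kind of pipeline is shown below: train a classifier, report accuracy and F1, and repeat over seeds as a crude stability check. Loading of the SMILE data set is not shown; the synthetic data, the gradient-boosting model, and the split sizes are placeholders rather than the team's actual solution.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=0)  # stand-in data
accs, f1s = [], []
for seed in range(5):                       # repeated runs to gauge stability
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
    model = GradientBoostingClassifier(random_state=seed).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    accs.append(accuracy_score(y_te, pred))
    f1s.append(f1_score(y_te, pred))
print(f"accuracy {np.mean(accs):.3f} ± {np.std(accs):.3f}, "
      f"f1 {np.mean(f1s):.3f} ± {np.std(f1s):.3f}")
```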
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
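As a small illustration of the patch-based training mentioned above, the snippet below cuts a large 2D array into fixed-size patches for training; the image size, patch size, and stride are arbitrary examples, not values reported in the survey.

```python
import numpy as np

def extract_patches(image, patch_size=128, stride=128):
    """Yield square patches from a 2D image (non-overlapping with the default stride)."""
    h, w = image.shape[:2]
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            yield image[y:y + patch_size, x:x + patch_size]

image = np.zeros((1024, 1024), dtype=np.float32)   # stand-in for a scan too large to fit in memory
patches = list(extract_patches(image))             # 64 patches of 128x128
print(len(patches), patches[0].shape)
```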
Text classification is a natural language processing (NLP) task relevant to many commercial applications, like e-commerce and customer service. Naturally, classifying such excerpts accurately often represents a challenge, due to intrinsic language aspects, like irony and nuance. To accomplish this task, one must provide a robust numerical representation for documents, a process known as embedding. Embedding represents a key NLP field nowadays, having seen significant advances in the last decade, especially after the introduction of the word-to-vector concept and the popularization of Deep Learning models for solving NLP tasks, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformer-based Language Models (TLMs). Despite the impressive achievements in this field, the literature coverage regarding the generation of embeddings for Brazilian Portuguese texts is scarce, especially when considering commercial user reviews. Therefore, this work aims to provide a comprehensive experimental study of embedding approaches targeting a binary sentiment classification of user reviews in Brazilian Portuguese. The study ranges from classical (Bag-of-Words) to state-of-the-art (Transformer-based) NLP models. The methods are evaluated with five open-source databases with pre-defined data partitions, made available in an open digital repository to encourage reproducibility. The Fine-tuned TLMs achieved the best results in all cases, followed by the Feature-based TLM, LSTM, and CNN, with alternating ranks depending on the database under analysis.
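To make the classical end of that spectrum concrete, the sketch below wires a Bag-of-Words (TF-IDF) representation into a linear classifier for binary sentiment; the short Portuguese reviews are toy stand-ins for the actual databases, and Transformer-based embeddings would replace the vectorizer in such a pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy Brazilian Portuguese reviews; 1 = positive, 0 = negative.
texts = ["produto excelente, recomendo",
         "chegou rápido e funciona muito bem",
         "péssimo atendimento, não comprem",
         "produto veio quebrado, estou decepcionado"]
labels = [1, 1, 0, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["entrega rápida e produto ótimo"]))
```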
Efficient data transfers over high-speed, long-distance shared networks require proper utilization of available network bandwidth. Using parallel TCP streams enables an application to utilize network parallelism and can improve transfer throughput; however, finding the optimum number of parallel TCP streams is challenging due to nondeterministic background traffic sharing the same network. Additionally, the non-stationary, multi-objective, and partially observable nature of network signals in the host systems adds extra complexity to finding the current network condition. In this work, we present a novel approach to finding the optimum number of parallel TCP streams using deep reinforcement learning (RL). We devise a learning-based algorithm capable of generalizing across different network conditions and utilizing the available network bandwidth intelligently. Contrary to rule-based heuristics that do not generalize well in unknown network scenarios, our RL-based solution can dynamically discover and adapt the number of parallel TCP streams to maximize network bandwidth utilization without congesting the network, while ensuring fairness among contending transfers. We extensively evaluated our RL-based algorithm's performance, comparing it with several state-of-the-art online optimization algorithms. The results show that our RL-based algorithm can find near-optimal solutions 40% faster while achieving up to 15% higher throughput. We also show that, unlike a greedy algorithm, our devised RL-based algorithm can avoid network congestion and fairly share the available network resources among contending transfers.
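For intuition, the toy sketch below uses a simple epsilon-greedy bandit over discrete stream counts with a made-up throughput simulator that penalizes congestion; the actual work uses deep RL on real network signals, so the simulator, action range, and hyperparameters here are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
actions = np.arange(1, 17)                  # candidate parallel stream counts 1..16

def simulated_throughput(n_streams, optimum=8):
    # Throughput rises up to the optimum, then degrades as extra streams cause congestion.
    base = min(n_streams, optimum) - 0.5 * max(0, n_streams - optimum)
    return base + rng.normal(0, 0.3)        # noisy "background traffic"

q = np.zeros(len(actions))
counts = np.zeros(len(actions))
epsilon, episodes = 0.1, 2000
for _ in range(episodes):
    i = rng.integers(len(actions)) if rng.random() < epsilon else int(np.argmax(q))
    reward = simulated_throughput(actions[i])
    counts[i] += 1
    q[i] += (reward - q[i]) / counts[i]     # incremental mean value update
print("learned stream count:", actions[int(np.argmax(q))])
```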
Measuring and monitoring soil organic carbon is critical for agricultural productivity and for addressing critical environmental problems. Soil organic carbon not only enriches nutrition in soil, but also has a gamut of co-benefits such as improving water storage and limiting physical erosion. Despite a litany of work on soil organic carbon estimation, current approaches do not generalize well across soil conditions and management practices. We empirically show that explicit modeling of cause-and-effect relationships among the soil processes improves the out-of-distribution generalizability of prediction models. We provide a comparative analysis of soil organic carbon estimation models where the skeleton is estimated using causal discovery methods. Our framework provides an average improvement of 81% in test mean squared error and 52% in test mean absolute error.
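A minimal sketch of the underlying idea, under invented assumptions: restrict a soil-organic-carbon (SOC) regressor to the causal parents given by a (here hand-written) skeleton instead of all covariates, and compare behaviour under a distribution shift. The variable names, data-generating process, and parent set are hypothetical; the work itself estimates the skeleton with causal discovery methods.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

def sample(n, spurious_linked=True):
    """Toy soil system: rainfall -> vegetation -> SOC <- temperature."""
    rainfall = rng.normal(size=n)
    temperature = rng.normal(size=n)
    vegetation = 0.8 * rainfall + rng.normal(scale=0.3, size=n)
    soc = 0.6 * vegetation + 0.2 * temperature + rng.normal(scale=0.2, size=n)
    # A non-causal correlate of SOC whose relationship breaks out of distribution.
    spurious = soc + rng.normal(scale=0.1, size=n) if spurious_linked else rng.normal(size=n)
    return np.column_stack([rainfall, temperature, vegetation, spurious]), soc

X_train, y_train = sample(500, spurious_linked=True)
X_test, y_test = sample(500, spurious_linked=False)   # out-of-distribution test set
parents = [1, 2]                                      # temperature, vegetation (assumed skeleton)

mse_all = mean_squared_error(
    y_test, LinearRegression().fit(X_train, y_train).predict(X_test))
mse_parents = mean_squared_error(
    y_test, LinearRegression().fit(X_train[:, parents], y_train).predict(X_test[:, parents]))
print(mse_all, mse_parents)   # the parents-only model degrades far less under the shift
```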